F-012: perf(services): cache compiled built-in secret regex patterns #16

Open
Sephyi wants to merge 1 commit into development from audit/f-012-cache-secret-patterns

Conversation

@Sephyi Sephyi commented Apr 22, 2026

Summary

perf(services): cache compiled built-in secret regex patterns.

Audit context

Closes audit entry F-012 from #3.

Verification

  • cargo fmt --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --all-targets

Note: one pre-existing test, `porcelain_exits_within_timeout_with_no_staged_changes`, is a known macOS cold-start flake that also reproduces on an unmodified `development` branch; it is unrelated to this change.

`build_patterns()` is called every time the configured custom/disabled
secret-pattern lists change, and previously re-ran `builtin_patterns()`
which compiled all 24 built-in regexes from scratch on each call.

Cache the built-in `Vec<SecretPattern>` in a `LazyLock` so the 24
regexes are compiled exactly once per process. `build_patterns()` now
clones cheap entries (Regex uses internal Arc sharing; the string
fields are `Cow<'static, str>`) from the cached slice rather than
recompiling. The filter semantics for `disabled_secret_patterns` are
preserved.
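
To make the caching idea concrete, here is a minimal std-only sketch, not the code from this PR. `Matcher` is a hypothetical stand-in for `regex::Regex` (it only wraps an `Arc<str>` so that cloning is cheap), the two pattern names are invented, and `COMPILE_COUNT` exists purely to show that the "compile" step runs once per process.

```rust
use std::borrow::Cow;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, LazyLock};

/// Demonstration-only counter: how many times the expensive "compile" step ran.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled regex; cloning bumps a refcount only.
#[derive(Clone)]
pub struct Matcher(Arc<str>);

impl Matcher {
    fn compile(needle: &str) -> Self {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        Matcher(Arc::from(needle))
    }

    pub fn is_match(&self, haystack: &str) -> bool {
        haystack.contains(&*self.0)
    }
}

#[derive(Clone)]
pub struct SecretPattern {
    pub name: Cow<'static, str>,
    pub matcher: Matcher,
}

/// Built-in patterns, initialized on first access and then shared.
static BUILTIN_PATTERNS: LazyLock<Vec<SecretPattern>> = LazyLock::new(|| {
    vec![
        SecretPattern {
            name: Cow::Borrowed("AWS Access Key"),
            matcher: Matcher::compile("AKIA"),
        },
        SecretPattern {
            name: Cow::Borrowed("Private Key"),
            matcher: Matcher::compile("-----BEGIN PRIVATE KEY-----"),
        },
    ]
});

pub fn builtin_patterns() -> &'static [SecretPattern] {
    BUILTIN_PATTERNS.as_slice()
}

/// Clones cached entries, skipping names listed in `disabled` (case-insensitive).
pub fn build_patterns(disabled: &[String]) -> Vec<SecretPattern> {
    let builtin = builtin_patterns();
    if disabled.is_empty() {
        return builtin.to_vec();
    }
    let disabled_lower: Vec<String> = disabled.iter().map(|s| s.to_lowercase()).collect();
    builtin
        .iter()
        .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
        .cloned()
        .collect()
}

pub fn compile_count() -> usize {
    COMPILE_COUNT.load(Ordering::SeqCst)
}
```

In this sketch, calling `build_patterns` repeatedly leaves `compile_count()` at 2, because later calls only clone the cached entries.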

This also consolidates the separate `DEFAULT_PATTERNS` `LazyLock`
into the single `BUILTIN_PATTERNS` cache; `scan_for_secrets` and
`scan_full_diff_for_secrets` now reference it via `builtin_patterns()`
which returns a `&'static [SecretPattern]`.
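
As a rough illustration of the consolidation, the default scan path can borrow the cached slice directly instead of owning a second copy. This is a simplified sketch under stated assumptions: patterns are literal substrings rather than regexes, `scan_with_patterns` is a hypothetical helper, and the signatures here are invented and may not match the real `scan_for_secrets`.

```rust
use std::sync::LazyLock;

/// Simplified, hypothetical pattern: a literal needle instead of a compiled regex.
pub struct SecretPattern {
    pub name: &'static str,
    pub needle: &'static str,
}

/// Single process-wide cache of the built-in patterns.
static BUILTIN_PATTERNS: LazyLock<Vec<SecretPattern>> = LazyLock::new(|| {
    vec![
        SecretPattern { name: "AWS Access Key", needle: "AKIA" },
        SecretPattern { name: "GitHub Token", needle: "ghp_" },
    ]
});

/// Hands out a borrowed view of the cache; no cloning, no recompilation.
pub fn builtin_patterns() -> &'static [SecretPattern] {
    BUILTIN_PATTERNS.as_slice()
}

/// Hypothetical core scanner: works on any borrowed pattern list.
pub fn scan_with_patterns(diff: &str, patterns: &[SecretPattern]) -> Vec<&'static str> {
    let mut hits = Vec::new();
    for line in diff.lines() {
        for p in patterns {
            if line.contains(p.needle) {
                hits.push(p.name);
            }
        }
    }
    hits
}

/// Default entrypoint: scans with the cached built-ins only.
pub fn scan_for_secrets(diff: &str) -> Vec<&'static str> {
    scan_with_patterns(diff, builtin_patterns())
}
```

A caller with custom or disabled patterns would pass its own list to the same scanner, so only that path pays for an owned `Vec`.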

The public API of `build_patterns()` is unchanged.

Closes audit entry F-012 from #3.
Copilot AI review requested due to automatic review settings April 22, 2026 19:50
@Sephyi Sephyi added the audit Codebase audit cleanup (issue #3) label Apr 22, 2026
@Sephyi Sephyi self-assigned this Apr 22, 2026
Copilot AI left a comment

Pull request overview

This PR addresses audit finding F-012 by avoiding repeated compilation of the built-in secret-scanning regex patterns, improving performance for repeated secret scans.

Changes:

  • Cache the 24 built-in SecretPattern regexes in a process-wide LazyLock and expose them via a slice.
  • Make SecretPattern Clone so callers can cheaply clone cached patterns when building a mutable pattern list.
  • Update default secret-scan entrypoints to use the cached built-in patterns directly.

Comment thread src/services/safety.rs
Comment on lines +43 to +51
    +    let mut patterns: Vec<SecretPattern> = if disabled.is_empty() {
    +        builtin.to_vec()
    +    } else {
             let disabled_lower: Vec<String> = disabled.iter().map(|s| s.to_lowercase()).collect();
    -        patterns.retain(|p| !disabled_lower.contains(&p.name.to_lowercase()));
    -    }
    +        builtin
    +            .iter()
    +            .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
    +            .cloned()
    +            .collect()
Copilot AI Apr 22, 2026

The disabled-pattern filtering path does repeated Vec::contains lookups and recomputes p.name.to_lowercase() for every built-in pattern. Consider using a HashSet<String> for the lowercased disabled names to avoid repeated linear scans/allocations (even if the built-in set is small today).
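
One possible shape of this suggestion is sketched below; it is an illustration, not the change applied in this PR. `filter_disabled` is a hypothetical helper name and `SecretPattern` is reduced to a name field to keep the example self-contained.

```rust
use std::collections::HashSet;

/// Reduced, hypothetical pattern type: only the name matters for filtering.
#[derive(Clone, Debug, PartialEq)]
pub struct SecretPattern {
    pub name: &'static str,
}

/// Returns the built-in patterns whose names are not in `disabled` (case-insensitive).
pub fn filter_disabled(builtin: &[SecretPattern], disabled: &[String]) -> Vec<SecretPattern> {
    if disabled.is_empty() {
        return builtin.to_vec();
    }
    // Lowercase the disabled names once; each membership test is then a hash lookup.
    let disabled_lower: HashSet<String> = disabled.iter().map(|s| s.to_lowercase()).collect();
    builtin
        .iter()
        // Note: `to_lowercase()` still allocates one String per built-in pattern.
        .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
        .cloned()
        .collect()
}
```

The `HashSet` removes the linear scan per pattern but not the per-pattern `to_lowercase()` allocation; avoiding that as well would need something like pre-lowercased names stored alongside each pattern.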
